Abstract

Background: Over the last decade, genome-scale metabolic models have played increasingly important
roles in elucidating metabolic characteristics of biological systems for a wide range of applications including, but
not limited to, system-wide identification of drug targets and production of high value biochemical compounds.
However, these genome-scale metabolic models must first be able to predict known in vivo phenotypes before they
can be applied to these applications with high confidence. One benchmark for measuring the in silico capability in
predicting in vivo phenotypes is the use of single-gene mutant libraries to measure the accuracy of knockout
simulations in predicting mutant growth phenotypes.

Results: Here we applied a systematic and iterative process, designated as Reconciling In silico/in vivo mutaNt
Growth (RING), to settle discrepancies between in silico predictions and in vivo observations for a newly reconstructed
genome-scale metabolic model of the fission yeast, Schizosaccharomyces pombe, SpoMBEL1693. The predictive
capabilities of the genome-scale metabolic model in predicting single-gene mutant growth phenotypes were
measured against the single-gene mutant library of S. pombe. The use of RING improved the overall
predictive capability of SpoMBEL1693 by 21.5 percentage points, from 61.2% to 82.7% (92.5% of the negative
predictions matched the observed growth phenotype and 79.7% of the positive predictions matched the observed
growth phenotype).

Conclusion: This study presents the validation and refinement of a newly reconstructed metabolic model of the yeast
S. pombe, improving the metabolic model's predictive capabilities by reconciling the in silico predicted
growth phenotypes of single-gene knockout mutants with experimental in vivo growth data.

Background

Genome-scale metabolic models have proven themselves 
in a wide range of applications in the field of biotechnol- 
ogy, such as system-wide drug targeting, metabolic en- 
gineering of microbial systems for production of various 
chemicals and materials, and system-wide understanding 
of cellular metabolism [1-5]. Although a large majority of 
these genome-scale metabolic models are of prokaryotic 
organisms, genome-scale metabolic models of eukaryotic
organisms exist and have contributed to the study of
eukaryotic metabolism [2,6]. For instance, the human 
genome-scale metabolic model has been employed in 
the study of Alzheimer's disease, giving insight into 
the disease and suggesting potential treatments [7]. 
Other eukaryotic genome-scale metabolic models, in 
addition to Homo sapiens [8,9], include Mus musculus 
[10], Leishmania major [11], Aspergillus nidulans [12], 
Aspergillus niger [13], Saccharomyces cerevisiae [6,14], 
and Pichia pastoris [15]. 

However, eukaryotic genome-scale metabolic models
are far from being complete due to the complexity of
eukaryotic systems, such as the presence of intracellular
organelles, which requires compartmentalization of the
metabolism, and regulation and gene expression networks
that are more complex than those of bacterial systems.
To ensure that the metabolic model can accurately
represent the biological system of interest, the predictive
capabilities of the metabolic model are compared against
experimental data as a means of validating the metabolic
model. This standard in evaluating metabolic models is
applied to different
conditions for which data are available [16]. Discrepan- 
cies between the predictions made by the metabolic 
model, or in silico predictions, and experimental results 
are used to direct revisions to the genome-scale meta- 
bolic model to improve its predictive capabilities [17-19]. 
Here we present a strategy for improving the capability
of genome-scale metabolic models to predict single-gene
knockout growth phenotypes through Reconciliation of
In silico/in vivo mutaNt Growth (RING) with
the single-gene knockout mutant library, and apply RING
to the newly reconstructed genome-scale metabolic
model of the fission yeast Schizosaccharomyces pombe to
validate and improve the metabolic model.

The fission yeast S. pombe is widely used as a model 
system for studying eukaryotic systems in life science re- 
search [20,21]. This yeast is also gaining acceptance in 
biotechnology as a cell factory platform in industrial 
applications [22]. It possesses a relatively small genome 
size for a eukaryote, 13.8 Mbp distributed over 3
chromosomes [20]. Genome studies of the yeast have
identified fifty genes homologous to human genes, attracting
interest from biomedical research [20]. Furthermore, its
unique cell cycle characteristics compared to other
yeasts (e.g., cell division through medial fission instead of
budding) make it an ideal model for studying mammalian
cell cycle control. A high percentage of the research
on S. pombe is dedicated to understanding cell
cycle control, as well as other cellular functions,
such as DNA repair and cellular maintenance.
Little research on the metabolism of S. pombe is
found beyond ethanol production and the catabolism of
substrates other than glucose, and even less on the
metabolic engineering of S. pombe. With the introduction
of a genome-scale metabolic model of S. pombe
validated with RING, research into the metabolism of
this yeast is expected to gather momentum.

The genome-scale metabolic model of S. pombe, 
SpoMBEL1693, consists of 1693 metabolic reactions and 
1744 metabolites, distributed among 8 different compart- 
ments representing the intracellular organelles. Employ- 
ing the single-gene knockout mutant library of S. pombe, 
RING was applied to improve and refine SpoMBEL1693 
to accurately represent the metabolic network of S. 
pombe [23]. In the initial comparison of in silico predictions
with the single-gene knockout mutant library, 61.2%
of all predictions correctly reflected the observed phenotypes
(41.4% of the predicted lethal phenotypes and
65.4% of the predicted viable phenotypes matched
their respective observed in vivo growth phenotypes).



After analysis and reconciliation of the false predictions,
SpoMBEL1693 was updated and the accuracy
improved: 82.6% of all predictions of the
single-gene knockout mutant growth phenotypes
matched the observed in vivo phenotypes (92.5% of the
predicted lethal phenotypes and 79.6% of the predicted
viable phenotypes matched their respective
observed in vivo phenotypes).

Results 

Here the strategy for reconciling differences between in 
silico predictions and in vivo observations (RING) is ap- 
plied to validate and upgrade the first reconstruction of 
the genome-scale metabolic model of S. pombe, SpoM- 
BEL1693. The ability of this newly reconstructed meta- 
bolic model to represent the metabolic physiology of this 
yeast was analyzed by comparing the growth phenotypes 
obtained by single gene knockout simulations with those 
experimentally observed for the single-gene knockout 
mutant library [23]. Using RING, the discrepancies be- 
tween in silico predictions and in vivo observations were 
systematically and iteratively resolved. The overall 
scheme for the process can be seen in Figure 1. 

In silico growth phenotypes for the deletion of every 
metabolic reaction were generated and the respective 
genes associated to each metabolic reaction were identified.
These growth phenotypes were then categorized as
either positive or negative (viable or lethal) with a viability
threshold of 10% of the "wild-type" growth rate. The
in vivo phenotype for each gene was then retrieved
from the publicly available single-gene knockout mutant
library [23]. Once the in vivo phenotypes were
retrieved and compared against the in silico predictions, 
the growth phenotypes were then further categorized 
based on whether the predictions matched the in vivo 
observations (True or False). The false predictions were 
then sorted and analyzed in a step-wise manner outlined 
in Figure 1 until all predictions were examined. 
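
The categorization step described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the example gene names and the growth rates are hypothetical, and treating a mutant at exactly 10% of the wild-type rate as viable is an assumption, since the text does not specify the boundary case.

```python
# Illustrative sketch only; names and numbers are hypothetical, not taken from the study.
VIABILITY_THRESHOLD = 0.10  # 10% of the wild-type growth rate, as stated in the text


def classify_growth(mutant_rate, wild_type_rate, threshold=VIABILITY_THRESHOLD):
    """Label an in silico mutant as 'viable' or 'lethal' relative to wild type."""
    if wild_type_rate <= 0:
        raise ValueError("wild-type growth rate must be positive")
    # Assumption: a rate exactly at the threshold counts as viable.
    return "lethal" if mutant_rate < threshold * wild_type_rate else "viable"


def compare(in_silico, in_vivo):
    """Return TP, TN, FP or FN for one prediction (positive = viable)."""
    correct = in_silico == in_vivo
    if in_silico == "viable":
        return "TP" if correct else "FP"
    return "TN" if correct else "FN"


wild_type_rate = 0.30  # hypothetical wild-type growth rate
# gene -> (simulated mutant growth rate, observed in vivo phenotype)
mutants = {
    "geneA": (0.29, "viable"),
    "geneB": (0.00, "lethal"),
    "geneC": (0.28, "lethal"),
    "geneD": (0.01, "viable"),
}
categories = {
    gene: compare(classify_growth(rate, wild_type_rate), observed)
    for gene, (rate, observed) in mutants.items()
}
print(categories)
```

The false predictions (FP and FN) are the entries that would then be passed on to the step-wise analysis.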

RING was employed iteratively to ensure that the changes
made to SpoMBEL1693 to reconcile the discrepancies
did not alter other results and negatively affect the
overall accuracy of the metabolic model, defined as the
number of correct in silico predictions over the total
number of predictions made. By
reconciling discrepancies between in silico prediction 
and in vivo data, the genome-scale metabolic model was 
able to accurately represent the metabolic characteristics 
of S. pombe. Simulations were performed under YES 
media conditions and the knockout results were categor- 
ized as either positive or negative, where the positive repre- 
sents viable phenotype for a given knockout and the 
negative represents a lethal phenotype for that knockout. 
When compared against the published results obtained
with the mutant library, the results are categorized as either

Figure 1 Diagram of the overall scheme of RING (Reconciliation of In silico/in vivo mutaNt Growth) divided into two stages. In the top
stage (Blue Box), the reconstruction of the genome-scale metabolic model and the single-gene mutant growth simulations from the S. pombe
genome are performed. Also in this stage is the generation of the single-gene mutant knockout library which was imported from Kim et al. [23]. In
the second stage (Red Box), the analysis and reconciliation of the discrepancies between in silico prediction and in vivo observation is performed
in an iterative manner. Once all possible strategies in reconciling the differences between the in silico predictions and in vivo observations are
exhausted based on current information and knowledge available, the end result is an updated genome-scale metabolic model.



true/false positives or true/false negatives according to 
whether the prediction agrees with the in vivo results. True 
results indicate that the in silico predictions match with the 
in vivo results and false results indicate a discrepancy
between the two. A false positive indicates that SpoMBEL1693
predicts a viable phenotype while the in vivo
result shows a lethal phenotype (Table 1). A false negative
indicates that SpoMBEL1693 predicts a lethal
phenotype while the in vivo result shows a viable phenotype
(Table 1). Analysis of false predictions via RING highlights
gaps in the knowledge of the metabolism of S. pombe
and leads to improvements to the metabolic model by
reconciling these differences between the in silico predictions
and in vivo observations.
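
As a minimal sketch of how the rates defined in Table 1 follow from the four categories, the snippet below computes them from raw counts. The function name and the counts are hypothetical and chosen only for illustration. The positive prediction rate is written as TP/(TP + FP), which is our reading of the table entry based on its description.

```python
def prediction_metrics(tp, tn, fp, fn):
    """Compute the Table 1 rates from true/false positive/negative counts.

    Assumes every denominator is non-zero (no guard against empty groups).
    """
    total = tp + tn + fp + fn
    return {
        "overall_accuracy": (tp + tn) / total,
        "negative_prediction_rate": tn / (tn + fn),
        "positive_prediction_rate": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts, not the values reported for SpoMBEL1693.
metrics = prediction_metrics(tp=80, tn=15, fp=20, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Note that the prediction rates condition on what the model predicted, whereas sensitivity and specificity condition on what was observed in vivo, so the two pairs can differ substantially for the same set of counts.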

Metabolic model characteristics 

The metabolic model of S. pombe, SpoMBEL1693, consists 
of 1693 metabolic reactions, including 386 transport and 
exchange reactions, and 1744 metabolites. The metabolic 
model is divided into 8 different compartments to repre- 
sent the different organelles in S. pombe: cytoplasm, mito- 
chondria, nucleus, peroxisome, endoplasmic reticulum, 



Golgi apparatus, vacuole and the extracellular environment
(Additional file 1). The metabolic reactions were taken 
from the Kyoto Encyclopedia of Genes and Genomes [24], 
NCBI, and supplemented with information in the S. pombe 
gene database on GeneDB [25]. Compartmental assign- 
ment of the reactions was based on the reports in which 
protein localization experiments were performed [26,27]. 
The total gene coverage of the metabolic model is 605 
genes out of 4940 protein-coding genes. 

An important metabolic reaction in SpoMBEL1693 is the 
biomass formation reaction. This "pseudo" metabolic reac- 
tion is used to represent the synthesis of cellular biomass, 
or cell growth. Construction of the biomass reaction 
involves the accumulation of all important components ne- 
cessary for biomass formation with the coefficients deter- 
mined through both experimental measurements and data 
present in the literature. The biomass reaction is particularly
important in our analysis as it is employed to indicate
whether a metabolic reaction and its respective genes are
essential for growth. Detailed information on the construction
of the biomass reaction can be found in the methods
and in Additional file 2. To validate the reconstruction of

Table 1 Terms and definitions used in the analysis between in silico predictions and in vivo observations

| Term | Definition | Description |
| --- | --- | --- |
| True positive (TP) | In silico: viable phenotype; In vivo: viable phenotype | Results where the metabolic model predicts a viable phenotype and the actual phenotype is also viable |
| True negative (TN) | In silico: lethal phenotype; In vivo: lethal phenotype | Results where the metabolic model predicts a lethal phenotype and the actual phenotype is also lethal |
| False positive (FP) | In silico: viable phenotype; In vivo: lethal phenotype | Results where the metabolic model predicts a viable phenotype but the actual phenotype is lethal |
| False negative (FN) | In silico: lethal phenotype; In vivo: viable phenotype | Results where the metabolic model predicts a lethal phenotype but the actual phenotype is viable |
| Overall accuracy | (TP + TN)/(TP + TN + FP + FN) | Percentage of correct predictions by the metabolic model |
| Negative prediction rate | (TN)/(TN + FN) | Percentage of negative predictions that are correctly predicted as lethal |
| Positive prediction rate | (TP)/(TP + FP) | Percentage of positive predictions that are correctly predicted as viable |
| Sensitivity | (TP)/(TP + FN) | Percentage of viable predictions correctly predicted as positive |
| Specificity | (TN)/(TN + FP) | Percentage of lethal predictions correctly predicted as negative |



this metabolic model, the in silico single knockout simulations
were measured against the single-gene knockout mutant
library through the use of the RING strategy, as
discussed in detail here. Furthermore, additional validation
of the metabolic model was done by comparing the
metabolic model's capability in utilizing various carbon
sources and producing ethanol at different dilution rates
(See Additional file 3).

Gene/reaction essentiality simulation 

Gene knockout simulations were performed to evaluate 
the capability of the metabolic model to predict growth 
phenotypes of S. pombe. The impact of each metabolic 
reaction and its respective gene on the growth phenotype 
was investigated using the metabolic model. As a result,
198 essential metabolic reactions corresponding to 84 
genes were identified (Additional file 4). Transport reac- 
tions and metabolic reactions for which no gene 
assignment or experimental data were available were not 
included in the analysis. However, duplicate metabolic 



reactions in different compartments were included and 
this accounts for the large difference in number of meta- 
bolic reactions and genes. It should be noted that the in 
silico simulation of the genome-scale metabolic model 
was based solely on the stoichiometry of the metabolic 
reactions, while the regulatory, signaling or other inter- 
active information was not included. 

Lethal genes were determined by observing the change 
in the in silico growth rate when the corresponding 
metabolic reaction was removed from the model, repre- 
senting the deletion of its respective genes. If the cell 
growth rate dropped to zero or less than 10% of the ori- 
ginal "wild-type" growth rate, the resulting phenotype 
was classified as lethal and the reaction and its respective
genes were considered to be essential. When the
in silico growth rate was unchanged or
remained greater than 10% of the "wild-type" growth
rate, the metabolic reaction and its respective genes
were determined to be non-essential, as the resulting
phenotype is viable. The RING analysis was performed
in an iterative manner where the metabolic model was 
revised based on the analysis of the comparison between 
the results of in silico knockout simulation and those ex- 
perimentally observed with single-gene knockout library 
[23]. 

Resolution and analysis of false positive predictions 

False results indicate that information is absent or incor- 
rect in the metabolic model resulting in a discrepancy 
with what is observed in vivo. Thus, these false results 
must be resolved through adding missing or correcting 
erroneous information such that the in silico predictions 
match the observed in vivo phenotypes. In this section 
we will examine the different cases in which false positive
predictions arise and strategies to resolve these
discrepancies. A false positive prediction indicates that a
viable phenotype is incorrectly predicted by the meta- 
bolic model when a metabolic reaction (and by associ- 
ation, its corresponding gene) is deleted. Analysis of the 
initial positive, or viable, predictions of mutant pheno- 
types of SpoMBEL1693 resulted in 65.4% of the positive 
predictions matching the observed in vivo phenotypes 
(296 false positives and 560 true positives) (Figure 2A). 
Strategies for resolving these inconsistencies through
RING analysis are summarized in Figure 3A and are
outlined in this section. The different strategies were
implemented in stages to systematically analyze the false
positive predictions.

The first step in reconciling false positive predictions was the
identification of all duplicated or redundant metabolic reactions
localized in a different compartment of the metabolic
network. These redundant metabolic reactions arise when
localization data place the respective proteins in those
compartments, which as a result provides

Figure 2 Summary of the metabolic reactions in their categories (true negative, false negative, true positive and false positive) from the
results of the in silico single-gene mutant prediction for SpoMBEL1693. A) Initial results and percentages of SpoMBEL1693 on predicting
single-gene mutant growth phenotype (negative prediction 41.4%, positive prediction 65.4%, overall prediction 61.2%). B) Improved rates in
true predictions by SpoMBEL1693 after one iteration of RING (negative prediction 61.7%, positive prediction 74.3%, overall prediction 70.5%).
C) Final results and percentages of SpoMBEL1693 on predicting single-gene mutant growth phenotype after updating the metabolic model to
resolve discrepancies between in silico prediction and in vivo data (negative prediction 92.5%, positive prediction 79.6%, overall prediction
82.6%). Percentages were calculated by the number of true predictions over the total number of predictions for each group.



an alternate route through another cellular compartment 
(Figure 3A Case 1). Localization data can also place an enzyme
in another compartment that contains no other enzymes
to balance the generation or consumption of its
metabolites (orphan reaction). If the gene is essential, knockout
of this orphan reaction gives a false positive, whereas knockout
of the duplicate metabolic reaction in the functional
compartment gives a true negative prediction. A total of 41 metabolic
reactions fall under this category and, when resolved, were
reclassified under the negative predictions. For instance,
many metabolic reactions have had their respective
proteins localized in the nucleus, where they are isolated from
other metabolic reactions, occurring in clusters or individually
but not as complete pathways; examples include the first two
steps into lower glycolysis, nicotinate metabolism and pentose
metabolism. To validate the essentiality of the genes, all instances
of the metabolic reactions they encode were deleted simultaneously.

Metabolic reactions with false positive predictions were
then checked for their connectivity to the metabolic
network. Analysis of the connectivity of these metabolic
reactions showed that false predictions were also correlated
with dead end reactions (metabolic reactions in pathways
that are connected at the upstream end but not at the
downstream end) and non-redundant orphan metabolic
reactions. The orphan metabolic reactions (Figure 3A Case 2)
account for 31 metabolic reactions in SpoMBEL1693 and
include metabolic reactions that charge tRNA with
amino acids to be used for protein synthesis. However,
tRNA compositions have already been incorporated
into the biomass formation reaction, making
these metabolic reactions redundant; they were therefore
removed from the analysis but retained in the
metabolic model.



Metabolic reactions in dead end pathways were recon- 
ciled by connecting the ends of the pathways to the meta- 
bolic network (Figure 3A Case 3). In the extreme instance 
where linking the metabolic pathway to the metabolic net- 
work failed to resolve the false positive prediction, the 
major downstream metabolite was incorporated into the 
biomass metabolic reaction representing cellular growth, 
directly linking the metabolic pathway to cellular growth. 
The heme biosynthetic pathway is one example of this 
case. Heme showed no metabolic role or function in the 
metabolic model, resulting in false positive results in the 
knockout simulation. However, the genes encoding the
metabolic reactions of the heme biosynthesis pathway 
were found to be essential for growth according to the 
single-gene mutant library as evidenced by the lethal 
phenotype displayed in knockouts of genes in heme bio- 
synthesis. Thus, heme was incorporated into the biomass 
metabolic reaction with a coefficient calculated with a neg- 
ligible cellular concentration to prevent any drain of cellu- 
lar resources by heme biosynthesis. By incorporating heme 
into the biomass metabolic reaction, the biosynthesis of 
heme becomes linked to cellular growth. A consequence 
of linking heme to biomass is the inclusion of iron ions 
into the YES media. Sterol biosynthesis is one instance 
where linking the metabolic pathways to the rest of the 
network was sufficient for resolving false positive predic- 
tions. Gaps in the metabolic pathway of sterol biosynthesis 
were filled (SPBC1709.07 and SPBC16E9.05) and con- 
firmed through GeneDB to resolve the false positive pre- 
dictions. A total of 37 metabolic reactions with false 
positive predictions were resolved and recategorized as
true negatives. 
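
The dead end case discussed above, including the heme example, can be illustrated with a toy network. This is a simplified sketch under stated assumptions: the reaction and metabolite names are hypothetical, all reactions are treated as irreversible, and a metabolite counts as a dead end if it is produced but neither consumed by any reaction nor drained by the biomass reaction.

```python
def dead_end_metabolites(reactions, biomass_components):
    """Return metabolites that are produced but never consumed or used for biomass.

    `reactions` maps a reaction name to (substrates, products), both sets.
    """
    produced = set().union(*(products for _, products in reactions.values()))
    consumed = set().union(*(substrates for substrates, _ in reactions.values()))
    return produced - consumed - set(biomass_components)


# Hypothetical toy network: R4 and R5 form a branch ending in heme.
reactions = {
    "R1": ({"glucose"}, {"g6p"}),
    "R2": ({"g6p"}, {"pyruvate"}),
    "R3": ({"pyruvate"}, {"amino_acid"}),
    "R4": ({"g6p"}, {"heme_precursor"}),
    "R5": ({"heme_precursor"}, {"heme"}),
}

before = dead_end_metabolites(reactions, biomass_components={"amino_acid"})
# Adding heme to the biomass reaction links the branch to growth.
after = dead_end_metabolites(reactions, biomass_components={"amino_acid", "heme"})
print(before, after)
```

In this toy case the branch ending in heme is flagged before the biomass reaction is extended and no dead ends remain afterwards, which mirrors the reasoning used for the heme biosynthetic pathway.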

The gene associations to metabolic reactions were then
examined to reconcile false positive predictions from the

Figure 3 Diagram of possible causes and solutions for false predictions. Circles represent metabolites and arrows represent metabolic
reactions. A) False positive predictions. B) False negative predictions. Red arrows indicate problems in the network and green arrows indicate
possible solutions. Red Xs indicate knockout of the reaction and green Xs are knockouts that would reconcile the issue to achieve the correct
prediction. For false negative predictions where the reaction is associated with multiple genes, the gene in red indicates that the gene is knocked
out or has uncertain metabolic essentiality (metabolically non-essential but essential in another capacity) and the genes in green indicate
additional knockouts that can potentially resolve the false prediction.



knockout simulation. One instance of this case is the as- 
sociation of multiple metabolic reactions to a single gene 
(Figure 3A Case 4). Enzymes encoded by a gene have
been known to participate in multiple functionalities in 
the metabolic network, and as a result, multiple meta- 
bolic reactions in the metabolic model are associated 
with the same gene. Hence, deletion of just one of the 
metabolic reactions does not accurately reflect the single 
gene knockout of the respective gene. To resolve this, 
all metabolic reactions associated to the target gene 
were deleted simultaneously. With the metabolic reactions
simultaneously deleted, such false positive predictions
were resolved and a lethal phenotype was predicted.
Sixty-four metabolic reactions were reconciled in this 
manner (Figure 2B). 
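
The difference between deleting a single reaction and deleting every reaction associated with a gene can be sketched as follows. The example is deliberately simplified and is not the simulation method used in this study: growth is approximated by a producibility check (can the biomass precursor be reached from the medium?) instead of a flux-based growth rate, and the network, the gene name and the helper functions are hypothetical.

```python
def can_grow(reactions, medium, targets):
    """Return True if every target metabolite is producible from the medium.

    Stand-in for a growth simulation: reactions fire once all substrates are available.
    """
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return set(targets) <= available


def without(reactions, removed):
    """Return a copy of the network with the named reactions deleted."""
    return {name: rxn for name, rxn in reactions.items() if name not in removed}


# Hypothetical network: two routes from a to b, both catalysed by geneX.
reactions = {
    "R1": ({"glucose"}, {"a"}),
    "R2": ({"a"}, {"b"}),
    "R3": ({"a"}, {"c"}),
    "R4": ({"c"}, {"b"}),
    "R5": ({"b"}, {"biomass_precursor"}),
}
gene_to_reactions = {"geneX": {"R2", "R4"}}
medium, targets = {"glucose"}, {"biomass_precursor"}

# Deleting one reaction at a time leaves the other route intact (viable).
single = {r: can_grow(without(reactions, {r}), medium, targets) for r in ("R2", "R4")}
# Deleting all reactions of geneX together blocks both routes (lethal).
gene_level = can_grow(without(reactions, gene_to_reactions["geneX"]), medium, targets)
print(single, gene_level)
```

Each single-reaction deletion is predicted viable, while the gene-level deletion is predicted lethal, which is the pattern that turns a false positive into a true negative in this case.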



The remaining false positive predictions were those 
that could not be reconciled in RING, due to a lack of
available information regarding the metabolic network.
Sixty-two metabolic reactions with false positive predic- 
tions showed no flux in the in silico wild-type flux distri- 
bution, indicating that these metabolic reactions are not 
used for growth, despite the fact that the deletion of their 
corresponding genes gives a lethal phenotype in vivo. 
The absence of any flux through these 62 metabolic reac- 
tions could be attributed to the lack of regulatory infor- 
mation that would direct the flux through that metabolic 
reaction. Thirty-seven metabolic reactions that showed
false predictions were not reconciled with high confidence
due to the simultaneous assignment of both viable
and lethal genes to the metabolic reactions. Eight of the
37 metabolic reactions overlap with the previous category
where the metabolic reactions exhibit no flux in
SpoMBEL1693. The remaining 29 metabolic reactions 
are utilized and exhibit fluxes when the growth rate is 
maximized. However, there is no indication whether the 
deletion of the reaction results in a lethal phenotype or 
the lethal gene(s) functions in another capacity that is es- 
sential for growth, but not reflected in the metabolic net- 
work. Therefore, to resolve these cases with high 
confidence, detailed characterization of all the genes 
associated to the metabolic reaction is needed. Overall, 
the correct prediction rate of viable phenotype was 
improved to 79.6% (61 false positive and 561 true posi- 
tive predictions) (Figure 2C) after RING was applied 
(Additional file 4). 

Resolution and analysis of false negative predictions 

False negative predictions are results where the growth
phenotype is predicted to be lethal but is viable experimentally.
The initial negative prediction rate was 41.4% (55
false negative and 39 true negative predictions) (Figure 2A).
These false negative predictions were also analyzed in 
stages and reconciled through RING (Figure 3B). 

Analysis of false negative predictions started with 
the examination of the genes associated to the meta- 
bolic reactions with false negative predictions. The 
large majority of false negative metabolic reactions 
were found to have multiple genes associated with the 
metabolic reactions (Figure 3B Case 1). Eleven of the 
metabolic reactions were associated with both viable 
and lethal genes and 25 metabolic reactions were 
associated with only viable genes. The false
predictions for these metabolic reactions could not be
reconciled due to insufficient information regarding the
functional roles these genes play in the metabolic
reactions. For example, in metabolic reactions associated
with both lethal and viable genes, it is possible
that the viable gene is a minor or non-essential
contributor to the functional performance of the
metabolic reaction. Also, for metabolic reactions with
multiple viable genes associated, it is possible that they 
perform an auxiliary role to each other and can func- 
tionally replace the other when that gene is deleted. In 
this instance, all genes associated to the metabolic reac- 
tion would have to be deleted to confirm essentiality of 
the reaction. 

Another instance of Case 1 is where all the genes
associated with the metabolic reaction are viable; here it is
also uncertain whether the metabolic reaction is essential to
the metabolic network (true negative) or whether the
negative prediction is indeed a false prediction. If the
metabolic reaction is truly essential to the metabolic 
network, then the knockout of all the genes that are 
associated with the metabolic reaction would give the 



lethal phenotype when predicted using SpoMBEL1693. 
Single-gene knockout mutants for these genes would 
not be sufficient in suppressing the metabolic reaction 
as it would be compensated by the presence of alter- 
nate genes that can function in place of the deleted 
gene. Due to the lack of information that would allow 
for the reconciling of these false predictions, the 
metabolic reactions were removed from the analysis 
and noted for future research. 
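
The reasoning about functionally replaceable genes can be made concrete with a small gene-reaction rule evaluator. This is a hypothetical sketch, not part of SpoMBEL1693: it assumes each reaction carries a rule written as a list of alternatives (isozymes), where each alternative is the set of genes required together, and the gene names are placeholders.

```python
def reaction_active(gene_rule, deleted_genes):
    """Return True if at least one alternative enzyme is still fully encoded.

    `gene_rule` is a list of gene sets: sets are alternatives (OR), and the
    genes inside one set are all required (AND). An empty rule yields False.
    """
    deleted = set(deleted_genes)
    return any(not (set(alternative) & deleted) for alternative in gene_rule)


# Hypothetical reaction catalysed by either of two isozymes.
isozyme_rule = [{"geneA"}, {"geneB"}]
# Hypothetical reaction requiring both subunits of one complex.
complex_rule = [{"geneC", "geneD"}]

print(reaction_active(isozyme_rule, {"geneA"}))           # other isozyme remains
print(reaction_active(isozyme_rule, {"geneA", "geneB"}))  # both isozymes deleted
print(reaction_active(complex_rule, {"geneC"}))           # complex is incomplete
```

Under this assumption a single-gene deletion leaves an isozyme-catalysed reaction active, so the reaction is lost only when all associated genes are deleted together. Whether the genes in question actually act as isozymes is exactly the information that is missing for these cases.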

The remaining false negative predictions were exam- 
ined to determine if the metabolic reactions affected 
the biosynthesis of biomass components for cellular 
growth. In this case, an alternate metabolic reaction is 
needed to resolve this false prediction (Figure 3B Case 
2). If a metabolic reaction is the only source of an es- 
sential metabolite (i.e. an essential intermediate neces- 
sary for the biosynthesis of biomass components), 
strategies were investigated to supply the essential me- 
tabolite from other sources within the metabolic net- 
work (e.g. another compartment). For example, in the 
cytoplasm, acetyl-CoA was produced only through the 
metabolic reaction represented by the enzyme Acetyl- 
CoA synthetase, which is a non-essential enzyme for 
growth based on the single-gene knockout mutant library.
However, knockout simulations show that
acetyl-CoA in the cytoplasm is essential for growth, as a
precursor to the synthesis of biomass components.
Thus, an alternate pathway that can produce acetyl- 
CoA is needed in the cytoplasm. Alternate metabolic 
reactions capable of producing acetyl-CoA were found 
in the mitochondria. However, localization data of the 
metabolic enzymes in S. pombe does not support the 
presence of the corresponding metabolic reactions in 
the cytoplasm [27]. Thus, to allow the cytoplasm 
compartment access to the acetyl-CoA produced in 
the mitochondria, the exchange reaction for acetyl- 
CoA between the mitochondria and the cytoplasm 
was added to confirm that a viable phenotype can be 
attained (Figure 3B Case 2). The addition of this ex- 
change reaction resulted in a viable phenotype and 
suggests the presence of an acetyl-CoA transport from 
the mitochondria to the cytosol. Direct transport of 
acetyl-CoA between the intracellular compartments is
not possible due to the compound's bulkiness and
amphiphilic nature [28]; therefore, the S. pombe genome
was searched for a carnitine-acetyl-CoA shuttle
that has been reported in S. cerevisiae (CAT2, YAT1
and YAT2). However, a search through the genome
annotation and a BLAST search for the carnitine-acetyl-CoA
shuttle in S. pombe yielded no candidates.
Due to the lack of any candidate transport protein
for acetyl-CoA across the mitochondrial membrane
and the improbability of direct
transport of acetyl-CoA, the inconsistency of acetyl-CoA
synthetase remained unresolved. The remaining
16 metabolic reactions could not be reconciled
due to insufficient information. After RING analysis
of false negative predictions, the reconciliation be- 
tween in silico and in vivo phenotypes resulted in the 
improvement of the correct prediction rate from 
41.4% to 92.5% of the negative predictions matching 
the observed in vivo phenotypes (17 false negative 
predictions and 198 true negative predictions) 
(Figure 2C). 

Comparative analysis of the yeast metabolic models 

The predictive capability of the S. pombe genome-scale 
metabolic model was compared to the predictive capabil- 
ity of another yeast metabolic model that has been 
reconstructed, S. cerevisiae iMM904 [17,18]. iMM904
was employed for similar studies in predicting the in
silico growth phenotypes and was used as a basis for
eukaryotic metabolic models' prediction capability of
mutant growth phenotypes [18]. First, the overall meta- 
bolisms of the two yeasts were examined with compart- 
mental assignment of duplicate metabolic reactions 
ignored in both yeasts, with the exception of metabolic 
reactions where the localization of these reactions was 
distinctly different. One distinct difference between the 
S. pombe and S. cerevisiae models is the lack of metabolic 
reactions localized in the peroxisome in S. pombe, due to the 
scarcity of knowledge on the peroxisome in the fission yeast, 
highlighting the need for additional studies into peroxisomal 
metabolism in S. pombe [18,29]. The central metabolic networks 
of the two yeasts displayed little structural variability, 
with the exception of the absence of the glyoxylate shunt 
in S. pombe. 

The results of the analysis of SpoMBEL1693 to predict 
mutant growth phenotypes were compared to those 
obtained with the S. cerevisiae metabolic model iMM904 
[18]. In the analysis of iMM904, two statistical classification 
functions, specificity and sensitivity, were employed 
in the analysis of the essentiality simulation to represent 
the proportion of negative and positive (lethal and viable) 
phenotypes correctly predicted as negative and positive, 
respectively (Table 1). In other words, specificity represents 
the proportion of negative phenotypes that were 
correctly predicted to be negative by the metabolic 
model, TN/(TN + FP). Sensitivity is defined in the same way, 
except that it is the proportion of positive phenotypes 
correctly predicted to be positive by the metabolic 
model, TP/(TP + FN). A specificity of 53.6% and a sensitivity 
of 99.1% were achieved using iMM904 [18]. For 
comparison, the specificity and sensitivity in predicting 
the phenotypes of single-gene knockout mutants using 
SpoMBEL1693 were calculated. A higher specificity of 
76.4% and a comparable sensitivity of 97.1% were 
obtained with SpoMBEL1693. A false viable rate, FP/(FP + FN), 
or the ratio of false predictions that have been 
experimentally observed to be lethal, was also calculated 
for iMM904 and compared with that obtained with 
SpoMBEL1693. The false viable rate obtained with 
SpoMBEL1693 (23.5%) was lower than that (46.4%) 
obtained with iMM904 (Figure 4). The specificities of 
other metabolic models, for which essentiality analysis 
was performed, were also calculated. It was found that 
the specificity of SpoMBEL1693 was similar to four of 
the seven metabolic models (70-80%), and of the 
remaining three, only one had a higher specificity than 
SpoMBEL1693 (Figure 4). The metabolic model of the 
extensively studied bacterium Escherichia coli, iAF1260, 
has a specificity of 73.4%, placing the S. pombe 
metabolic model on the same level of performance 
as the model of this bacterium in predicting mutant growth 
phenotypes. 
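
As a minimal sketch of the classification measures used above, the following Python snippet tallies TP, TN, FP and FN from paired predicted and observed phenotypes and computes specificity and sensitivity as defined in the text (lethal = negative, viable = positive). The function names and the example phenotype lists are hypothetical and are not taken from the data of this study.

```python
from collections import Counter


def confusion_counts(predicted, observed):
    """Tally TP, TN, FP, FN for paired phenotype calls.

    Each call is "viable" (positive) or "lethal" (negative).
    """
    counts = Counter({"TP": 0, "TN": 0, "FP": 0, "FN": 0})
    for pred, obs in zip(predicted, observed):
        if pred == "viable":
            # Predicted viable: correct only if observed viable.
            counts["TP" if obs == "viable" else "FP"] += 1
        else:
            # Predicted lethal: correct only if observed lethal.
            counts["TN" if obs == "lethal" else "FN"] += 1
    return counts


def specificity(c):
    """Proportion of lethal phenotypes predicted lethal: TN/(TN + FP)."""
    return c["TN"] / (c["TN"] + c["FP"])


def sensitivity(c):
    """Proportion of viable phenotypes predicted viable: TP/(TP + FN)."""
    return c["TP"] / (c["TP"] + c["FN"])


# Hypothetical calls for six knockouts (illustration only, not study data).
predicted = ["viable", "viable", "lethal", "lethal", "viable", "lethal"]
observed = ["viable", "lethal", "lethal", "viable", "viable", "lethal"]

counts = confusion_counts(predicted, observed)
print(dict(counts))
print(f"specificity = {specificity(counts):.3f}")
print(f"sensitivity = {sensitivity(counts):.3f}")
```

Under this convention a false positive is a knockout predicted viable but observed lethal, which is why resolving false positives raises specificity.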

With the S. pombe genome-scale metabolic model 
improved through RING, its metabolic capabilities were 
examined and compared to the metabolic capabilities of 
the S. cerevisiae genome-scale metabolic model. The 
maximum in silico molar yields of four metabolites 
that have been targeted in past metabolic engineering 
studies (acetate, ethanol, lactate and succinate) were 
determined for each yeast using their respective genome-scale 
metabolic models (SpoMBEL1693 and iMM904). The results 
show a difference in maximum in silico yield for the 
metabolites acetate and lactate and no difference in the 




Measure              iMM904    SpoMBEL1693 
Sensitivity          99.1%     97.1% 
Specificity          53.6%     76.4% 
False Viable Rate    46.4%     23.5% 

Strain (Model name)                  Specificity 
Escherichia coli (iAF1260)           73.4% 
Mycoplasma genitalium (iPS189)       79.0% 
Bacillus subtilis (iBsu1103)         89.3% 
Pseudomonas putida (iJP815)          74.5% 
Helicobacter pylori (iIT341)         73.0% 
Salmonella typhimurium (iMA945)      66.7% 


Figure 4 Comparison of the performance and composition of 
the S. cerevisiae metabolic model iMM904 and the S. pombe 
metabolic model SpoMBEL1693. The numbers in the Venn 
Diagram indicate the number of reactions in the metabolic models. 
Specificity = TN/(TN + FP), Sensitivity = TP/(TP + FN), False Viable 
Rate = FP/(FP + FN), TN = True Negative, TP = True Positive, FN = False 
Negative, FP = False Positive. 

yields for ethanol and succinate (Table 2). Simulations 
show that S. pombe has a higher yield in producing lactate 
than S. cerevisiae (whose yield is approximately 15% lower 
than that of S. pombe), suggesting that S. pombe would be a 
more suitable host for producing lactate from glucose. With 
acetate, S. pombe shows a slightly lower yield than S. cerevisiae, 
which is an advantage for S. pombe as acetate is commonly 
found as a metabolic by-product. Furthermore, 
the lower acetate yield may also be a reflection of the absence 
of acetate during aerobic ethanol fermentation 
in S. pombe, whereas acetate was observed in S. cerevisiae 
[30]. 

Discussion 

Here we reported the validation and improvement of the 
newly reconstructed genome-scale metabolic model of the 
fission yeast S. pombe, SpoMBEL1693, presented here for 
the first time. The experimental results imported from 
the publicly available single-gene knockout mutant library 
were utilized to improve the accuracy of SpoMBEL1693 
in predicting mutant growth phenotypes. The 
strategy designated as RING was employed to identify 
and reconcile discrepancies between the in silico prediction 
results and the experimental results. Iterative application 
of RING resulted in a step-wise improvement in 
the accuracy of the genome-scale metabolic model of 
this less-studied yeast. The first iteration of RING 
improved the overall accuracy by 9.3% (61.2% 
to 70.5%). The second iteration was then performed and 
further increased the accuracy by another 12.2% (70.5% 
to 82.7%). 

Previous studies have reconciled differences 
between in silico predictions and in vivo observations 
of mutant phenotypes [17,18,31]. In the recent 
study with iMM904, GrowMatch was employed to resolve 
discrepancies between the in silico predictions and 
in vivo observations [18]. In that study, the gene-protein-reaction 
(GPR) relationship was employed to simulate the gene 
knockout. However, GrowMatch also suggested several 
suppression strategies that went beyond the knockout of 
single genes to resolve inconsistencies for a single-gene 
mutant phenotype [32]. With the GPR relationships 
in S. pombe not fully characterized for most of the metabolic 
reactions, it was decided that a direct metabolic 


Table 2 Maximum in silico molar yields of various 
metabolites a 

Metabolites    Schizosaccharomyces pombe    Saccharomyces cerevisiae 
Acetate        2.55                         2.62 
Ethanol        2                            2 
Lactate        2                            1.73 
Succinate      1.5                          1.5 

a) Molar yields are in (mol metabolites/mol glucose). 



reaction knockout would be more suitable for simulating 
the mutant metabolic phenotype than knocking out 
the metabolic reaction through uncertain GPR 
relationships. Furthermore, without the constraint of a 
preconceived GPR relationship, information on the actual 
GPR relationships among the genes, proteins and reactions 
can be illuminated. 

Analysis of the false predictions has identified a num- 
ber of areas for which insufficient information is avail- 
able to improve the accuracy of the metabolic model in 
predicting growth phenotypes of single-gene knockout 
mutants. For instance, the case where multiple genes are 
associated with a single metabolic reaction was discussed. 
In some instances, both viable and lethal genes 
are associated with the reaction, whereas in other instances 
multiple viable genes are associated with a reaction that 
is predicted to be essential in the metabolic network. 
Further experimental 
data on these metabolic reactions and their correspond- 
ing genes will provide hints on how to resolve these 
issues and further improve the representation of the 
yeast S. pombe by SpoMBEL1693 or its upgraded future 
version. 

Reconciliation of discrepancies between the in silico 
prediction results and experimental results showed that 
a majority of the reconciled metabolic reactions are 
those that were predicted to be false positives. The rec- 
onciliation of these false positive predictions was 
achieved through the linking of the metabolic pathways 
to cellular growth. This contributed to the improved ac- 
curacy in negative predictions by increasing the number 
of true negatives. Literature evidence supporting these 
modifications is lacking; they are therefore potential 
points of interest for further studies and characterization 
of the metabolism of S. pombe. Of the remaining false 
positives, many of the metabolic reactions displayed no 
flux in the metabolic model. This indicates the lack of 
specific characterization of the role of these metabolic 
reactions in the metabolism of the yeast. As many of these 
metabolic reactions are found in nucleotide metabolism 
or secondary metabolic pathways, it is likely that the 
annotation of the genes for these enzymes is incomplete. 
Included in the false positive predictions were the results 
for which no experimental data or literature evidence 
were available, so whether they are truly false positive 
predictions or true positive predictions could not be 
determined. They were nevertheless counted as false predictions 
to highlight the need for experimental data for these 
genes. 

Reconciliation of false negative predictions required a different 
approach from reconciling false positive predictions. 
Because the pathways were important for the synthesis of 
components used in the generation of biomass for cellular 
growth, alternate pathways would be required to bypass the 
deleted metabolic reaction and allow cellular growth. 
However, localization data for the enzymes in S. pombe 
prevent the addition of metabolic reactions to compartments 
in which the respective enzymes have not been localized. 
Thus, these alternate pathways that would allow 
cellular growth were made accessible through exchange 
reactions between the different compartments. This is 
shown in the case of acetyl-CoA synthetase, where alternate 
pathways that synthesize acetyl-CoA in the mitochondria 
were made available to the cytoplasm through the 
exchange reaction for acetyl-CoA between the two compartments. 
However, due to the absence of any known 
acetyl-CoA shuttle system, the discrepancy could not be resolved 
with a high level of confidence. While this discrepancy 
remained unresolved, the analysis did suggest the 
presence of an uncharacterized transporter for acetyl-CoA 
across the mitochondrial membrane. Furthermore, in comparison 
with the S. cerevisiae metabolic network, it was 
found that acetyl-CoA synthetase is also essential there, 
but S. cerevisiae has two genes associated with the reaction: 
one essential and one non-essential gene. As S. 
cerevisiae possesses the carnitine-acetyl-CoA shuttle (absent 
in the metabolic model of S. cerevisiae), this suggests the 
need for a more in-depth study of the essential gene associated 
with acetyl-CoA synthetase in S. cerevisiae. Other false negative 
predictions require additional information beyond what the 
single-gene knockout mutant library can provide. For instance, 
an essential metabolic reaction can be associated with 
two genes where one gene may compensate for the knockout 
of the other; a double-gene knockout would then be required 
to determine the validity of the predicted in silico 
phenotype. 
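
The last point can be illustrated with a short sketch. It assumes the simplest possible gene-reaction association, in which either of two genes alone is sufficient to catalyze the reaction (an OR rule); the gene names and the helper function are hypothetical and do not correspond to any annotated S. pombe genes.

```python
def reaction_active(isozyme_genes, deleted_genes):
    """Return True if at least one isozyme gene survives the deletion.

    Models a simple OR-type gene-reaction association: any one of the
    listed genes is assumed sufficient to catalyze the reaction.
    """
    return any(gene not in deleted_genes for gene in isozyme_genes)


# Hypothetical essential reaction catalyzed by either of two genes.
isozymes = ["geneA", "geneB"]

single_a = reaction_active(isozymes, {"geneA"})         # geneB compensates
single_b = reaction_active(isozymes, {"geneB"})         # geneA compensates
double = reaction_active(isozymes, {"geneA", "geneB"})  # no gene remains

print(single_a, single_b, double)
```

Under this assumption, both single-gene mutants keep the reaction active and would be observed as viable even though the reaction itself is essential; only the double knockout removes the reaction and can test the in silico prediction.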

Comparison of the RING analysis results on reconciling 
single-gene mutant phenotypes in S. pombe with the study 
reconciling single-gene mutants of S. cerevisiae using 
GrowMatch demonstrates the advantages of the flexibility 
RING brings to the process. It should be noted that while 
both approaches examined the problem of reconciling in 
silico predictions with in vivo observations at both the gene 
and reaction levels, the simulations done at the reaction 
level (RING) and the simulations done at the gene level 
(GrowMatch) may not be directly comparable. Yet, using the 
RING strategy, we were able to resolve a higher proportion 
of false positive predictions, as demonstrated by the higher 
specificity (i.e. false positives are lethal phenotypes 
predicted to be viable). Also, the proportion of viable 
phenotypes accurately predicted to be viable (sensitivity) in S. 
pombe is comparable to, though slightly lower than, that in the 
study with S. cerevisiae. Considering that the volume of 
knowledge on S. cerevisiae is far greater than that on S. pombe, 
the results attained with RING are notable. 

Conclusions 

In this paper, we reported the reconstruction of the 
genome-scale metabolic model of the fission yeast S. pombe, 
SpoMBEL1693, and the strategy for refining the model's 
ability to predict the growth phenotypes of the single-gene 
knockout mutants. An iterative process called 
RING was employed in reconciling false in silico predictions 
with experimentally observed phenotypes to improve 
the accuracy of the metabolic model by 21.5%. 
Despite the substantial increase in the accuracy of the metabolic 
model in predicting single-gene mutant phenotypes, 
unresolved inconsistencies between in silico predictions 
and in vivo observations highlight the gaps in our knowledge 
regarding the metabolism of S. pombe, as does the lack of 
literature evidence supporting the reconciled changes made to 
the metabolic model based on our analysis. Detailed 
characterization of GPR relationships specific to S. pombe 
would increase the confidence level in resolving the 
inconsistencies. Furthermore, double-gene mutant phenotypes 
would also aid in resolving many of the inconsistencies where 
gene duplicates exist. The SpoMBEL1693 metabolic model 
reconstructed and validated here is a first step towards 
enhancing our understanding of eukaryotic metabolism. 

Methods 

Model reconstruction 

The initial reconstruction of the metabolic model was 
performed using the set of biochemical reactions anno- 
tated from the genome and presented in the Kyoto 
Encyclopedia of Genes and Genomes [24], NCBI, and 
the S. pombe gene database on GeneDB [25]. Compart- 
ment assignment was also taken from previous reports 
where protein localization was determined experimen- 
tally [26,27]. Transport reactions were brought in from 
the TransportDB [33] (See Additional file 5). 

From the KEGG databases, the genomic information 
of S. pombe was downloaded and the gene information 
and the E.C. numbers assigned to the enzymes encoded 
by the respective genes were extracted. All metabolic 
reactions were collected and transferred into the metabolic 
model reconstruction (See Additional file 1). Water 
and hydroxyl ions were not balanced, on the assumption that 
other non-enzymatic functions in the cell use these 
molecules and that they therefore do not need to be 
balanced in the set composed of enzymatic reactions. 
Once the set of biochemical reactions had been collected, 
the list was curated for any inconsistencies or gaps in the 
network. 

Strains and culture conditions 

Cultures of the fission yeast were performed to obtain data 
utilized in the reconstruction of the genome-scale metabolic 
model. S. pombe was obtained from the DSMZ 
(DSM-70576) and was cultured in yeast nitrogen base 
medium without amino acids to create a stock of the yeast, 
stored at -80°C until thawed for fermentation and wet 
experiments performed for the validation of the model. S. 
pombe was cultured at 30°C. 

Batch cultures were carried out as follows. Seed cultures 
were prepared by transferring 500 µL of 10 mL overnight 
cultures, prepared in yeast nitrogen base medium without 
amino acids plus 10 g/L of glucose, into a 250 mL 
Erlenmeyer flask containing 100 mL of the same medium 
and incubating in a shaker at 30°C. The cultured cells were 
used to inoculate the fermenter containing 2 L of yeast nitrogen 
base medium without amino acids containing 
20 g/L glucose at 30°C. Batch culture was carried out in a 
6.6 L Bioflo 3000 fermenter (New Brunswick Scientific 
Co., Edison, NJ). The agitation speed was initially set at 
200 rpm and was increased as needed by automatic 
control to maintain a dissolved oxygen concentration 
(DOC) at 40% of air saturation or greater. The pH was 
maintained at 6.00 ± 0.1 using 28% (v/v) ammonia solution. 
Foaming was controlled by the addition of Antifoam 289 
(Sigma, St. Louis, MO). Aeration was done at a flow rate 
of 0.25 vvm during the whole period of fermentation. 

Samples for the measurement of amino acid composition, 
to be used for cellular growth, were taken from the 
batch cultures during the exponential phase. Nine milliliters 
of the culture was centrifuged and the supernatant 
was removed, leaving the cell pellet, which was used to 
analyze the amino acid composition (See Analytical 
procedures). 

Analytical procedures 

Cell growth was monitored by measuring the absorbance at 
600 nm (OD600) with an Ultrospec3000 spectrophotometer 
(Amersham Biosciences, Uppsala, Sweden). Cell concentration, 
defined as gram dry cell weight (gDCW) per liter, was 
determined by using the correlation found in the literature 
relating the OD600 to dry weight (1 OD600 = 0.62 gDCW/L). 
The concentrations of glucose and by-products in the 
media were determined by high-performance liquid 
chromatography (Varian ProStar 210, Palo Alto, CA) equipped 
with UV/VIS (Varian ProStar 320, Palo Alto, CA) and RI 
(Shodex RI-71, Tokyo, Japan) detectors. A MetaCarb 87H 
column (300 × 7.8 mm, Varian) was isocratically eluted with 
0.01 N H2SO4 at 60°C and a flow rate of 0.6 mL/min. 

The composition of the amino acids and fatty acids in S. 
pombe was determined from samples obtained from 
batch fermentations during the exponential growth 
phase in the yeast nitrogen base medium without amino 
acids, containing 20 g/L glucose as a carbon source. 
Amino acid compositions were quantified using a Waters 
HPLC system (Waters Corporation, Milford, MA), which 
consists of two 510 HPLC pumps, a gradient controller, 
a 717 automatic sampler, a 996 photodiode array detector, 
and a Millennium 32 chromatography manager, together 
with a Waters Pico-Tag column (3.9 × 300 mm). Absorbance 
at 254 nm was measured. Other components were 
adopted from the literature or assumed (See Additional 
file 2). 

In silico flux analysis 

For the analysis of the genome-scale metabolic model, in 
silico flux analysis was used where the internal metabolites 
were first balanced under the assumption of 
pseudo-steady state [34]. This resulted in a stoichiometric 
model $\sum_j S_{ij} v_j = 0$, in which $S_{ij}$ is the 
stoichiometric coefficient of metabolite $i$ in the $j$th 
reaction and $v_j$ is the flux of the $j$th reaction given in 
mmol/gDCW/h. Linear programming (LP), subject to the 
constraints pertaining to mass conservation, reaction 
thermodynamics, and capacity, was carried out to determine 
the fluxes [35]. These constraints were presented in the form 
of upper and lower bounds for the fluxes 
($v_{j,\min} \le v_j \le v_{j,\max}$) for each reaction $j$, and 
used together with an objective function $Z$, usually the 
growth rate [14,36]. 
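
Written out, the linear program described above takes the following standard form. The objective coefficients $c_j$ are notation introduced here for illustration only (they select the biomass reaction when growth is the objective); the remaining symbols follow the definitions in the text.

```latex
\begin{aligned}
\max_{v} \quad & Z = \sum_{j} c_j v_j \\
\text{subject to} \quad & \sum_{j} S_{ij} v_j = 0
    \quad \text{for every internal metabolite } i, \\
& v_{j,\min} \le v_j \le v_{j,\max}
    \quad \text{for every reaction } j.
\end{aligned}
```

In this form, an irreversible reaction corresponds to $v_{j,\min} = 0$, and a reaction knockout corresponds to setting $v_{j,\min} = v_{j,\max} = 0$.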

Gene/reaction essentiality and mutant growth phenotype 
simulations were performed in the GAMS Integrated Development 
Environment using the CPLEX solver. Reaction 
knockout was simulated by constraining each flux to zero in turn, 
while the objective function was set to maximize cellular 
growth. If the resulting cellular growth, or biomass formation, 
was less than 10% of the "wild-type" value while the 
flux of the metabolic reaction was constrained to zero, then 
the deletion of the corresponding gene was considered to 
be lethal and the metabolic reaction to be essential. If no 
change to the cellular growth was observed, or biomass formation 
was greater than 10% of the "wild-type" value when the 
metabolic reaction was constrained to zero, then the resulting 
growth phenotype was considered to be viable and the 
metabolic reaction to be non-essential. YES medium was 
used in the in silico simulations to mimic the growth conditions 
under which the single-gene mutant library experiments 
were conducted: nutrients found in yeast extract, 
as well as adenine, histidine, leucine, uracil, and lysine [23], 
were unconstrained, and the glucose uptake rate was set 
to the experimentally determined value of 4.19 mmol 
glucose/gDCW/h. Additional compounds, such as 
iron, were also included to ensure cell growth. 
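
The essentiality rule described above can be sketched in a few lines of Python. This is an illustration only, not the GAMS/CPLEX implementation used in the study: the function `toy_growth` is a hypothetical stand-in for the LP solve, the toy pathway and its capacities are invented, and treating a growth value of exactly 10% of wild type as viable is an assumption, since the text does not specify that boundary case.

```python
def classify_knockout(ko_growth, wt_growth, threshold=0.10):
    """Label a knockout "lethal" if growth is below threshold * wild type.

    Growth exactly at the threshold is treated as viable (an assumption).
    """
    if wt_growth <= 0:
        raise ValueError("wild-type growth must be positive")
    return "lethal" if ko_growth < threshold * wt_growth else "viable"


def toy_growth(blocked):
    """Stand-in for the LP solve on a hypothetical toy pathway.

    uptake -> (R1 or R2 in parallel) -> biomass; capacities are made up.
    A blocked reaction has its flux constrained to zero.
    """
    caps = {"uptake": 4.19, "R1": 1000.0, "R2": 0.2, "biomass": 1000.0}

    def cap(rxn):
        return 0.0 if rxn in blocked else caps[rxn]

    # Maximum throughput is limited by the tightest step in series.
    return min(cap("uptake"), cap("R1") + cap("R2"), cap("biomass"))


wild_type = toy_growth(set())
results = {
    rxn: classify_knockout(toy_growth({rxn}), wild_type)
    for rxn in ["uptake", "R1", "R2", "biomass"]
}
print(results)
```

In this toy case, blocking R1 leaves only the low-capacity route R2, so growth drops below 10% of the wild-type value and R1 is called essential, whereas blocking R2 leaves growth unchanged and R2 is called non-essential.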

Additional files 



Additional file 1: List of metabolic reactions and metabolite 
abbreviations used in SpoMBEL1693. 

Additional file 2: SpoMBEL1693 Characteristics and biomass 
composition. 

Additional file 3: Additional validation studies for SpoMBEL1693 - 
Carbon source utilization and Flux Variability Analysis of ethanol 
production capacity. 

Additional file 4: List of False positive and False negative 
predictions from single knockout simulation using SpoMBEL1693. 

Additional file 5: SpoMBEL1693 in SBML format. 